Support InternVL3.5-Flash #3952

Merged
lvhan028 merged 8 commits into InternLM:main from CUHKSZzxy:internvl-flash
Sep 17, 2025

@CUHKSZzxy CUHKSZzxy commented Sep 9, 2025

Performance & Acc

Tested with VLMEvalKit, dataset: OCRBench

| Model | Time | Acc |
| --- | --- | --- |
| InternVL3.5-8B | ~120s | 85.2 |
| InternVL3.5-8B-Flash | ~115s | 84.9 |

InternVL3.5-Flash accuracy details

{
    "Text Recognition": 249,
    "Scene Text-centric VQA": 181,
    "Doc-oriented VQA": 165,
    "Key Information Extraction": 181,
    "Handwritten Mathematical Expression Recognition": 73,
    "Final Score": 849,
    "Final Score Norm": 84.9
}
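
As a sanity check on the numbers above, the per-category scores add up to the reported "Final Score", and the normalized value is that sum divided by 10. The sketch below assumes the benchmark has 1000 questions worth one point each, which is what the 849 to 84.9 relationship implies; it is not taken from the VLMEvalKit code.

```python
# Per-category scores copied from the JSON above.
scores = {
    "Text Recognition": 249,
    "Scene Text-centric VQA": 181,
    "Doc-oriented VQA": 165,
    "Key Information Extraction": 181,
    "Handwritten Mathematical Expression Recognition": 73,
}

# Sum of the five categories; should match "Final Score".
final_score = sum(scores.values())

# Assumption: 1000 questions, one point each, so dividing by 10
# turns the raw score into a 0-100 value ("Final Score Norm").
final_score_norm = final_score / 10

print(final_score, final_score_norm)
```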

Related

@lvhan028 lvhan028 added the enhancement New feature or request label Sep 10, 2025

def mlp_block(in_dim, out_dim):
    return nn.Sequential(
        nn.Linear(in_dim, out_dim),

Any reason for not using build_rowwise_linear and build_colwise_linear?

Collaborator Author

https://github.com/InternLM/lmdeploy/blob/main/lmdeploy/pytorch/models/internvl.py#L368

We can. However, as with the code linked above, the vision encoding is executed only once, so for simplicity I did not replace it with a custom op.

If you think it's worthwhile, I will replace them.
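
For readers unfamiliar with the trade-off being discussed, here is a minimal pure-Python sketch of the usual tensor-parallel pattern that column-wise and row-wise linear layers enable: the first weight is split by columns, the second by rows, and the per-rank partial outputs are summed. This is not lmdeploy's implementation of `build_colwise_linear` or `build_rowwise_linear`; the helper names, the two-rank split, the ReLU activation, and the tiny weights are all assumptions for illustration.

```python
def matmul(x, w):
    """Multiply x ([n][k]) by w ([k][m]) using plain lists."""
    return [[sum(row[k] * w[k][j] for k in range(len(w)))
             for j in range(len(w[0]))] for row in x]

def relu(m):
    """Elementwise activation (assumed; any elementwise function works)."""
    return [[max(0.0, v) for v in row] for row in m]

def split_cols(w, parts):
    """Split a weight matrix into `parts` equal column blocks."""
    step = len(w[0]) // parts
    return [[row[i * step:(i + 1) * step] for row in w] for i in range(parts)]

def split_rows(w, parts):
    """Split a weight matrix into `parts` equal row blocks."""
    step = len(w) // parts
    return [w[i * step:(i + 1) * step] for i in range(parts)]

x = [[1.0, -2.0], [0.5, 3.0]]
w1 = [[1.0, -1.0, 2.0, 0.0], [0.5, 1.0, -1.0, 2.0]]      # 2 x 4
w2 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 2.0]]   # 4 x 2

# Unsharded two-layer MLP, comparable to stacking plain linear layers.
reference = matmul(relu(matmul(x, w1)), w2)

# Hypothetical two-rank split: column-wise first layer, row-wise second layer.
partials = [matmul(relu(matmul(x, a)), b)
            for a, b in zip(split_cols(w1, 2), split_rows(w2, 2))]

# Summing the partial outputs stands in for the all-reduce across ranks.
sharded = [[sum(p[i][j] for p in partials) for j in range(len(w2[0]))]
           for i in range(len(x))]

print(reference)
print(sharded)
```

Both paths produce the same result, so the choice between plain linear layers and sharded ones is about memory and compute distribution rather than correctness, which is consistent with the author's point that a projector run once per image gains little from the extra machinery.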

@lvhan028 lvhan028 requested a review from grimoire September 10, 2025 04:20

lvhan028 commented Sep 17, 2025

@grimoire may resolve the following warning in PR #3922

/nvme1/lvhan/lmdeploy/lmdeploy/pytorch/kernels/cuda/flatten_kv_cache.py:77: UserWarning: Logical operators 'and' and 'or' are deprecated for non-scalar tensors; please use '&' or '|' instead
/nvme1/lvhan/lmdeploy/lmdeploy/pytorch/kernels/cuda/flatten_kv_cache.py:80: UserWarning: Logical operators 'and' and 'or' are deprecated for non-scalar tensors; please use '&' or '|' instead
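
The warning above asks for `&` / `|` instead of `and` / `or` when the operands are not scalars. The contents of lines 77 and 80 of `flatten_kv_cache.py` are not shown here, so the snippet below is only a plain-Python analogy with made-up mask values, not the kernel code: it shows why the keyword operators are a poor fit for elementwise masks, since `and` evaluates whole objects while `&` can be applied per element.

```python
# Two hypothetical boolean masks (illustrative values only).
a = [True, True, False]
b = [False, True, True]

# `and` looks at the truthiness of the whole left operand. A non-empty
# list is truthy, so the expression simply returns the right operand.
whole_object = a and b

# `&` applied per element gives the intended combined mask.
elementwise = [x & y for x, y in zip(a, b)]

print(whole_object)
print(elementwise)
```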

@lvhan028 lvhan028 merged commit 8e0d680 into InternLM:main Sep 17, 2025
5 checks passed
@CUHKSZzxy CUHKSZzxy deleted the internvl-flash branch September 24, 2025 13:05